A comparative study of divisive hierarchical clustering algorithms
Abstract
A general scheme for divisive hierarchical clustering algorithms is proposed. It consists of three main steps: first, a splitting procedure that subdivides a cluster into two subclusters; second, a local evaluation of the bipartitions resulting from the tentative splits; and third, a formula for determining the node levels of the resulting dendrogram. A handful of such algorithms is given. These algorithms are compared using the Goodman-Kruskal correlation coefficient. Used as a global criterion, it is an internal goodness-of-fit measure that compares the order induced by the hierarchy with the order associated with the given dissimilarities. Applied to one hundred random data tables, these comparisons favor two methods whose splitting procedures rely on unusual ratio-type formulas, namely the Silhouette criterion and Dunn's criterion. Both criteria take into account the within-cluster and the between-cluster mean dissimilarity. In general, the results of these two algorithms are better than those of the classical agglomerative Average Link method.
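The abstract does not spell out how the Goodman-Kruskal coefficient is computed, so the following is only a minimal sketch based on the standard definition of the gamma coefficient, (C - D) / (C + D), where C and D count concordant and discordant comparisons between pairs of items. The function name, the list-of-lists matrix layout, and the choice to ignore ties are assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def goodman_kruskal_gamma(d, u):
    """Goodman-Kruskal gamma between two dissimilarity matrices.

    d: the given dissimilarities.
    u: the dissimilarities induced by the hierarchy (assumed here to be
       the level of the lowest dendrogram node joining each pair).
    Both are symmetric square lists of lists over the same items;
    only the upper triangle is read.
    """
    n = len(d)
    pairs = list(combinations(range(n), 2))
    concordant = discordant = 0
    # Compare every pair of item pairs and check whether d and u
    # order them the same way.
    for (i, j), (k, l) in combinations(pairs, 2):
        s = (d[i][j] - d[k][l]) * (u[i][j] - u[k][l])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 is a tie in d or u; ties are ignored (assumption).
    total = concordant + discordant
    if total == 0:
        raise ValueError("gamma is undefined when every comparison is tied")
    return (concordant - discordant) / total


# Hypothetical three-item example: u merges items 0 and 1 at level 1,
# then joins item 2 at level 5.
d = [[0, 1, 4], [1, 0, 5], [4, 5, 0]]
u = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]
gamma = goodman_kruskal_gamma(d, u)
```

A value close to 1 means the hierarchy preserves the ordering of the original dissimilarities; the quadratic number of pair comparisons makes this direct version practical only for small tables.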
Similar resources
Hierarchical Clustering Algorithm - A Comparative Study
Clustering is a data mining (machine learning) technique used to place data elements into related groups without advance knowledge of the group definitions. In this paper the authors provide an in-depth explanation of the implementation of agglomerative and divisive clustering algorithms for various types of attributes. Database-the details of the victims of Tsunami in
On the performance of bisecting K-means and PDDP
problem is known as bisecting divisive clustering. Note that by recursively using a divisive bisecting clustering procedure, the dataset can be partitioned into any given number of clusters. Interestingly enough, the clusters so-obtained are structured as a hierarchical binary tree (or a binary taxonomy). This is the reason why the bisecting divisive approach is very attractive in many applicat...
Agglomerative and Divisive Approaches to Unsupervised Learning in Gestalt Clusters
Hierarchical clustering algorithms can be agglomerative or divisive, depending on how partitions are formed. Such algorithms have advantages mainly related to the desired level of granularity the partition should have. The work described in this paper approaches two hierarchical algorithms, one agglomerative (and three of its variants) and the other divisive, focusing on their performance in un...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have drawn many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...
Approximation Bounds for Hierarchical Clustering: Average Linkage, Bisecting K-means, and Local Search
Hierarchical clustering is a data analysis method that has been used for decades. Despite its widespread use, the method has an underdeveloped analytical foundation. Having a well understood foundation would both support the currently used methods and help guide future improvements. The goal of this paper is to give an analytic framework to better understand observations seen in practice. This ...
Journal: CoRR
Volume: abs/1506.08977
Publication date: 2015